-
Notifications
You must be signed in to change notification settings - Fork 124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
get hookedSAETransformer + tutorial working #125
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #125 +/- ##
==========================================
- Coverage 64.11% 63.90% -0.21%
==========================================
Files 17 17
Lines 1761 1773 +12
Branches 289 291 +2
==========================================
+ Hits 1129 1133 +4
- Misses 568 575 +7
- Partials 64 65 +1 ☔ View full report in Codecov by Sentry. |
Why is code being copy and pasted over from transformer lens? Is it not possible to just use |
@ArthurConmy This was a tough call for me but I think it's probably the right one. T-lens implements it's own SAE which has a few dissimilarities to ours and which might change / go out of sync. Part of this was that I tried to use the T-lens one with SAE Lens and it needed enough changes that don't make sense as PR's to T-lens that I thought it might be best to duplicate / get them in sync. However, this thinking is cached from a couple of weeks ago and my conviction is less strong now without more context. Possibly worth trying to enumerate the challenges and make changes to my SAE class of the HookedSAE transformer class such that this isn't necessary. |
@jbloomAus there is definitely a problem with too many SAEs classes. I think there probably should be one training SAE (likely the SAE in this codebase) and one inference SAE (used for circuit analysis and feature viz), but currently we have
Which seems too many (already, the fact I had to use 3) and 2) when doing steering vectors work sucked IMO) ETA: ping me if I should make this a new issue somewhere. |
I don't mind having the discussion here. My thoughts are:
|
@dtch1997 Thanks for this! We now have hooked SAE transformer. I appreciate you doing this and sorry how it turned out with duplicate work. Will try to find a way to make it up to you. |
Description
Updates Joseph's HookedSAETransformer to current codebase
Fixes # (issue)
Type of change
Please delete options that are not relevant.
Checklist:
You have tested formatting, typing and unit tests (acceptance tests not currently in use)
make check-ci
to check format and linting. (you can runmake format
to format code if needed.)Performance Check.
If you have implemented a training change, please indicate precisely how performance changes with respect to the following metrics:
Please links to wandb dashboards with a control and test group.